Security, Privacy, and Transparency Guarantees for Machine Learning Systems
Machine learning (ML) is transforming a wide range of applications, promising to bring immense economic and social benefits. However, it also raises substantial security, privacy, and transparency challenges. ML workloads push companies toward aggressive data collection and loose data access policies, placing troves of sensitive user information at risk if the company is hacked. ML also introduces new attack vectors, such as adversarial example attacks, which can completely nullify models’ accuracy under attack. Finally, ML models make complex data-driven decisions, which are opaque to end-users and difficult for programmers to inspect. In this dissertation we describe three systems we developed. Each system addresses one dimension of these challenges by combining new practical systems techniques with rigorous theory to achieve a guaranteed level of protection and to make systems easier to understand. First, we present Sage, a differentially private ML platform that enforces a meaningful protection semantic for the troves of personal information amassed by today’s companies. Second, we describe PixelDP, a defense against adversarial examples that leverages differential privacy theory to provide a guaranteed level of accuracy under attack. Third, we introduce Sunlight, a tool to enhance the transparency of opaque targeting services, using rigorous causal inference theory to explain targeting decisions to end-users.
Pyramid: Enhancing Selectivity in Big Data Protection with Count Featurization
Protecting vast quantities of data poses a daunting challenge for the growing
number of organizations that collect, stockpile, and monetize it. The ability
to distinguish data that is actually needed from data collected "just in case"
would help these organizations to limit the latter's exposure to attack. A
natural approach might be to monitor data use and retain only the working-set
of in-use data in accessible storage; unused data can be evicted to a highly
protected store. However, many of today's big data applications rely on machine
learning (ML) workloads that are periodically retrained by accessing, and thus
exposing to attack, the entire data store. Training set minimization methods,
such as count featurization, are often used to limit the data needed to train
ML workloads to improve performance or scalability. We present Pyramid, a
limited-exposure data management system that builds upon count featurization to
enhance data protection. As such, Pyramid uniquely introduces both the idea and
proof-of-concept for leveraging training set minimization methods to instill
rigor and selectivity into big data management. We integrated Pyramid into
Spark Velox, a framework for ML-based targeting and personalization. We
evaluate it on three applications and show that Pyramid approaches
state-of-the-art models while training on less than 1% of the raw data.
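
To make the idea of count featurization concrete, the following is a minimal sketch, not Pyramid's implementation. It assumes binary labels and categorical features, and the function names (`build_count_tables`, `featurize`) are hypothetical. Each categorical value is replaced by the fraction of positive labels observed with it and by how often it was seen, so a model can train on compact count tables rather than on the raw records.

```python
from collections import defaultdict


def build_count_tables(rows, labels):
    """Count, per feature value, how often it co-occurs with label 0 and 1.

    rows   -- list of dicts mapping feature name -> categorical value
    labels -- list of 0/1 labels, one per row
    """
    tables = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for row, y in zip(rows, labels):
        for feat, val in row.items():
            tables[feat][val][y] += 1
    # Freeze into plain dicts so later lookups never create new entries.
    return {feat: dict(vals) for feat, vals in tables.items()}


def featurize(row, tables):
    """Replace each categorical value with (positive rate, total count)."""
    out = []
    for feat in sorted(row):
        neg, pos = tables.get(feat, {}).get(row[feat], (0, 0))
        total = neg + pos
        out.append(pos / total if total else 0.0)  # unseen value -> 0.0
        out.append(total)
    return out


# Hypothetical toy data: which ad was shown, and whether it was clicked.
rows = [{"ad": "a"}, {"ad": "a"}, {"ad": "b"}]
labels = [1, 0, 1]
tables = build_count_tables(rows, labels)
features_a = featurize({"ad": "a"}, tables)
features_unseen = featurize({"ad": "c"}, tables)
```

In this sketch only the count tables need to stay in accessible storage once they are built; the raw rows are no longer consulted by `featurize`.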
XRay: Enhancing the Web's Transparency with Differential Correlation
Today's Web services - such as Google, Amazon, and Facebook - leverage user
data for varied purposes, including personalizing recommendations, targeting
advertisements, and adjusting prices. At present, users have little insight
into how their data is being used. Hence, they cannot make informed choices
about the services they choose. To increase transparency, we developed XRay,
the first fine-grained, robust, and scalable personal data tracking system for
the Web. XRay predicts which data in an arbitrary Web account (such as emails,
searches, or viewed products) is being used to target which outputs (such as
ads, recommended products, or prices). XRay's core functions are service
agnostic and easy to instantiate for new services, and they can track data
within and across services. To make predictions independent of the audited
service, XRay relies on the following insight: by comparing outputs from
different accounts with similar, but not identical, subsets of data, one can
pinpoint targeting through correlation. We show both theoretically, and through
experiments on Gmail, Amazon, and YouTube, that XRay achieves high precision
and recall by correlating data from a surprisingly small number of extra
accounts.
Comment: Extended version of a paper presented at the 23rd USENIX Security
Symposium (USENIX Security 14)
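
The correlation insight can be illustrated with a simplified sketch. This is not XRay's actual algorithm; the scoring rule, the function name `score_inputs`, and the toy data are assumptions made for illustration. Each account holds a different subset of inputs, and an input is scored by how much more often the output appears in accounts that contain it than in accounts that do not.

```python
def score_inputs(accounts, saw_output):
    """Score each input by the difference in output rate with vs. without it.

    accounts   -- dict mapping account id -> set of inputs in that account
    saw_output -- set of account ids in which the output (e.g. an ad) appeared
    """
    all_inputs = set().union(*accounts.values())
    scores = {}
    for item in all_inputs:
        with_item = [a for a, data in accounts.items() if item in data]
        without = [a for a, data in accounts.items() if item not in data]
        rate_with = sum(a in saw_output for a in with_item) / len(with_item)
        rate_without = (
            sum(a in saw_output for a in without) / len(without) if without else 0.0
        )
        scores[item] = rate_with - rate_without
    return scores


# Hypothetical toy setup: three accounts with overlapping subsets of emails.
accounts = {
    "acct1": {"email1", "email2"},
    "acct2": {"email1", "email3"},
    "acct3": {"email2", "email3"},
}
saw_ad = {"acct1", "acct2"}  # the ad appeared only in these accounts
scores = score_inputs(accounts, saw_ad)
likely_target = max(scores, key=scores.get)
```

In this toy setup the ad appears exactly in the accounts that contain `email1`, so that input receives the highest score.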
SerpinB1 protects the mature neutrophil reserve in the bone marrow
Deficiency of the neutrophil protease inhibitor serpinB1 leads to a reduced pool of mobilizable neutrophils through decreased neutrophil survival in the bone marrow.
Secondary Flow as a Mechanism for the Formation of Biofilm Streamers
In most environments, such as natural aquatic systems, bacteria are found predominantly in self-organized sessile communities known as biofilms. In the presence of a significant flow, mature multispecies biofilms often develop into long filamentous structures called streamers, which can greatly influence ecosystem processes by increasing transient storage and cycling of nutrients. However, the interplay between hydrodynamic stresses and streamer formation is still unclear. Here, we show that suspended thread-like biofilms steadily develop in zigzag microchannels with different radii of curvature. Numerical simulations of a low-Reynolds-number flow around these corners indicate the presence of a secondary vortical motion whose intensity is related to the bending angle of the turn. We demonstrate that the formation of streamers is directly proportional to the intensity of the secondary flow around the corners. In addition, we show that a model of an elastic filament in a two-dimensional corner flow is able to explain how the streamers can cross fluid streamlines and connect corners located at opposite sides of the channel.
The bilateral responsiveness between intestinal microbes and IgA
The immune system has developed strategies to maintain a homeostatic relationship with the resident microbiota. IgA is central to maintaining this relationship, as the most dominant immunoglobulin isotype at the mucosal surface of the intestine. Recent studies report a role for IgA in shaping the composition of the intestinal microbiota and exploit strategies to characterise IgA-binding bacteria for their inflammatory potential. We review these findings here, and place them in the context of the current understanding of the range of microorganisms that contribute to the IgA repertoire and the pathways that determine the quality of the IgA response. We examine why only certain intestinal microbes are coated with IgA, and discuss how understanding the determinants of this specific responsiveness may provide insight into diseases associated with dysbiosis.